Abstract

Let $(W, W')$ be an exchangeable pair of vectors in $\mathbb{R}^k$. Suppose this pair satisfies

$$E(W'|W) = (I_k - \Lambda)W + R(W).$$

If $\|W - W'\|_2 \le K$ and $R(W) = 0$, then concentration of measure results of the following form are proved for all $w \succeq 0$ when the moment generating function of $W$ is finite:

$$P(W \succeq w),\ P(W \preceq -w) \le \exp\left(-\frac{\|w\|_2^2}{2\nu_1 K^2}\right),$$

for an explicit constant $\nu_1$, where $\succeq$ stands for the coordinatewise $\ge$ ordering.

This result is applied to examples like complete nondegenerate U-statistics. We also treat doubly indexed permutation statistics, where $R(W) \ne 0$, and obtain similar concentration of measure inequalities. Practical examples of doubly indexed permutation statistics include the Mann-Whitney-Wilcoxon statistic and the random intersection of two graphs. Both examples are used in nonparametric statistical testing. We conclude the paper with a multivariate generalization of a recent concentration result due to Ghosh and Goldstein [6] involving bounded size bias couplings, together with a simple application.



1 Introduction 

Stein's method for normal approximation was devised to obtain rates of convergence in central limit theorems. Exchangeable pairs $(W, W')$ satisfying the linearity condition

$$E(W'|W) = (1 - \lambda)W \quad \text{for some } \lambda \in (0, 1),$$

are often useful for obtaining Kolmogorov distance bounds between the distribution of $W$ and the standard normal distribution using Stein's method. The reader is referred to [17] for further details. This condition was generalized in [15] to include a remainder term,

$$E(W'|W) = (1 - \lambda)W + R(W), \tag{1}$$

for some measurable function $R(\cdot)$. Using (1), the authors obtained rates of convergence in the central limit theorem for weighted U-statistics and the antivoter model. Although this condition is quite general, obtaining a usable closed form expression for the remainder term $R(W)$ can be challenging.

Recently Reinert and Röllin [2] proposed a multivariate formulation of (1). In particular, suppose it is possible to construct an exchangeable multivariate tuple $(W, W') \in \mathbb{R}^k \times \mathbb{R}^k$ so that the following relation holds for some matrix $\Lambda$ and $R : \mathbb{R}^k \to \mathbb{R}^k$,

$$E(W'|W) = (I_k - \Lambda)W + R(W). \tag{2}$$

Under (2), the authors obtain bounds in normal approximation for a rich class of smooth and nonsmooth test functions of $W$.

Raic [13], Chatterjee [3] and Ghosh and Goldstein [6] obtained concentration of measure type inequalities using tools from Stein's method. Raic used the idea of the Cramér transform, while Chatterjee used a generalized version of exchangeable pairs. Ghosh and Goldstein [6] obtained concentration results for centered and scaled positive random variables using size biased couplings. In this paper we obtain some new concentration of measure results under the framework of (2). A general concentration result is contained in Theorem 2.1 for $R(W) = 0$, while the case of doubly indexed permutation statistics, which does not satisfy this condition, is handled later.

The paper is organized as follows. In Section 2, we state and prove Theorem 2.1. In Section 3, we apply Theorem 2.1 to obtain concentration of measure results for complete nondegenerate U-statistics. In Section 4, we obtain concentration results for doubly indexed permutation statistics, which cannot be obtained by applying Theorem 2.1. The results for doubly indexed permutation statistics are used to obtain concentration of measure results for two cases of practical importance, the Mann-Whitney-Wilcoxon rank statistic and the random intersection of interpoint distance based graphs, both of which are important in nonparametric hypothesis testing.

2 The main result 

In this section and the following, for $a, b \in \mathbb{R}^k$, we define the partial ordering $\succeq$ by

$$a \succeq b \iff a_i \ge b_i \ \text{ for } 1 \le i \le k.$$

Also, we define the order $\preceq$ by

$$a \preceq b \iff b \succeq a.$$

The definitions of $\succ$ and $\prec$ are similar. Also, for any $\theta \in \mathbb{R}^k$, $\theta^t$ stands for its transpose. The first theorem of this paper is stated below.

Theorem 2.1. Suppose $(W, W') \in \mathbb{R}^k \times \mathbb{R}^k$ is an exchangeable vector tuple satisfying (2) with $R(W) = 0$, that is

$$E(W'|W) = (I_k - \Lambda)W, \tag{3}$$

for some invertible matrix $\Lambda \in M_k(\mathbb{R})$, the set of $k \times k$ real matrices. Also assume $\|W - W'\|_2 \le K$ for a constant $K$. If $m(\theta) = E(e^{\theta^t W}) < \infty$ for all $\theta \in \mathbb{R}^k$, then for any $w \succeq 0$,

$$P(W \succeq w),\ P(W \preceq -w) \le \exp\left(-\frac{\|w\|_2^2}{2\nu_1 K^2}\right), \tag{4}$$

where $\nu_1 = 1/\sigma_1(\Lambda)$, with $\sigma_1(\Lambda)$ denoting the smallest singular value of $\Lambda$ henceforth. Also, the individual coordinate random variables satisfy the following inequalities:

$$P(W_i \ge w_i),\ P(W_i \le -w_i) \le \exp\left(-\frac{w_i^2}{2\nu_1 K^2}\right). \tag{5}$$

Remark 2.1. If the exact value of $\nu_1$ is not available, we can use upper bounds on $\nu_1$ instead. For example, since

$$\sigma_1^2(\Lambda) \ge \frac{\det(\Lambda^t\Lambda)}{\operatorname{trace}(\Lambda^t\Lambda)^{k-1}} = l^2, \tag{6}$$

we obtain $1/l \ge \nu_1$. Thus the right hand side of (4) can be bounded by $\exp(-(l\|w\|_2^2)/2K^2)$.
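The determinant and trace bound of Remark 2.1 can be checked numerically on a small example. The sketch below is illustrative only: the $2 \times 2$ matrix `lam` is a hypothetical choice, not one taken from the paper, and the helper names are assumptions. It compares the exact smallest singular value with the lower bound $l$ from (6).

```python
import math

def gram(m):
    """Return M^t M for a square matrix given as a list of rows."""
    k = len(m)
    return [[sum(m[r][i] * m[r][j] for r in range(k)) for j in range(k)]
            for i in range(k)]

def smallest_singular_value_2x2(m):
    """Smallest singular value of a 2x2 matrix via the eigenvalues of M^t M."""
    g = gram(m)
    tr = g[0][0] + g[1][1]
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    disc = max(tr * tr - 4.0 * det, 0.0)
    return math.sqrt(max((tr - math.sqrt(disc)) / 2.0, 0.0))

def remark_lower_bound_2x2(m):
    """Lower bound l with l^2 = det(M^t M) / trace(M^t M)^(k-1), here k = 2."""
    g = gram(m)
    tr = g[0][0] + g[1][1]
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return math.sqrt(det / tr)

# Hypothetical invertible matrix, chosen only for illustration.
lam = [[0.5, 0.0], [-1.0, 1.0]]
sigma1 = smallest_singular_value_2x2(lam)
l = remark_lower_bound_2x2(lam)
nu1 = 1.0 / sigma1
```

In this example `l` does not exceed `sigma1`, so `1 / l` is a valid upper bound on `nu1`, as the remark requires.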

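Before we begin the proof, we note the following inequality, which follows from the convexity of the exponential function:

$$\frac{e^x - e^y}{x - y} = \int_0^1 e^{ty + (1-t)x}\,dt \le \int_0^1 \left(te^y + (1-t)e^x\right)dt = \frac{e^x + e^y}{2} \quad \text{for all } x \ne y.$$

Hence

$$|e^x - e^y| \le \frac{|x - y|}{2}\left(e^x + e^y\right). \tag{7}$$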
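As a quick sanity check of the convexity inequality $|e^x - e^y| \le \frac{|x-y|}{2}(e^x + e^y)$, the following sketch evaluates both sides on a grid. The grid and the function names are arbitrary choices made for illustration.

```python
import math

def lhs(x, y):
    """Left side: |e^x - e^y|."""
    return abs(math.exp(x) - math.exp(y))

def rhs(x, y):
    """Right side: |x - y| (e^x + e^y) / 2."""
    return abs(x - y) / 2.0 * (math.exp(x) + math.exp(y))

# Grid of points in [-5, 5] with step 0.25 (an arbitrary illustrative choice).
grid = [i / 4.0 for i in range(-20, 21)]
worst_gap = min(rhs(x, y) - lhs(x, y) for x in grid for y in grid)
```

The smallest gap over the grid is nonnegative, and it vanishes only on the diagonal $x = y$.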

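Next we give the proof of Theorem 2.1.

Proof. The gradient vector of $m(\theta)$ is given by

$$\nabla m(\theta) = \left(\frac{\partial m(\theta)}{\partial \theta_1}, \ldots, \frac{\partial m(\theta)}{\partial \theta_k}\right)^t = E(We^{\theta^t W}). \tag{8}$$

Using (3), we obtain

$$\nabla m(\theta) = E(We^{\theta^t W}) = E((W - W')e^{\theta^t W}) + E(W'e^{\theta^t W})$$
$$= E((W - W')e^{\theta^t W}) + E(E(W'|W)e^{\theta^t W})$$
$$= E((W - W')e^{\theta^t W}) + (I_k - \Lambda)E(We^{\theta^t W}).$$

Changing sides we obtain

$$\Lambda E(We^{\theta^t W}) = E((W - W')e^{\theta^t W}). \tag{9}$$

Since $(W, W')$ is exchangeable, we have

$$E((W - W')e^{\theta^t W}) = E((W' - W)e^{\theta^t W'}) = -E((W - W')e^{\theta^t W'}),$$

implying

$$E((W - W')e^{\theta^t W}) = \frac{1}{2}E\left((W - W')(e^{\theta^t W} - e^{\theta^t W'})\right). \tag{10}$$

Using (9) and (10), we obtain

$$\Lambda E(We^{\theta^t W}) = \frac{1}{2}E\left((W - W')(e^{\theta^t W} - e^{\theta^t W'})\right). \tag{11}$$

Premultiplying both sides by $\Lambda^{-1}$, we have

$$E(We^{\theta^t W}) = \frac{1}{2}E\left(\Lambda^{-1}(W - W')(e^{\theta^t W} - e^{\theta^t W'})\right).$$

Using (8) and (7), that is $|e^x - e^y| \le \frac{|x - y|}{2}(e^x + e^y)$, we obtain

$$\|\nabla m(\theta)\|_2 = \|E(We^{\theta^t W})\|_2 = \frac{1}{2}\left\|E\left(\Lambda^{-1}(W - W')(e^{\theta^t W} - e^{\theta^t W'})\right)\right\|_2$$
$$\le \frac{1}{2}E\left(\|\Lambda^{-1}(W - W')\|_2\,|e^{\theta^t W} - e^{\theta^t W'}|\right)$$
$$\le \frac{1}{2}E\left(\|\Lambda^{-1}\|_2\|W - W'\|_2\,|e^{\theta^t W} - e^{\theta^t W'}|\right)$$
$$\le \frac{1}{4}E\left(\|\Lambda^{-1}\|_2\|W - W'\|_2\,|\theta^t(W - W')|\,(e^{\theta^t W} + e^{\theta^t W'})\right), \tag{12}$$

where, in the above calculations, for any matrix $A \in M_k(\mathbb{R})$, $\|A\|_2$ is the spectral norm of $A$, that is

$$\|A\|_2 = \sup_{x \in \mathbb{R}^k,\ \|x\|_2 = 1} \|Ax\|_2 = \lambda_k^{1/2}(A^tA),$$

where $\lambda_1 \le \lambda_2 \le \cdots \le \lambda_k$ denote the eigenvalues of $A^tA$. Denoting $\Lambda^{-t} = (\Lambda^{-1})^t$ and using $(\Lambda^t)^{-1} = (\Lambda^{-1})^t$, we have

$$\|\Lambda^{-1}\|_2 = \lambda_k^{1/2}(\Lambda^{-t}\Lambda^{-1}) = \lambda_k^{1/2}((\Lambda\Lambda^t)^{-1}) = 1/\lambda_1^{1/2}(\Lambda\Lambda^t) = 1/\sigma_1(\Lambda) = \nu_1.$$

Hence, using the Cauchy-Schwarz inequality, exchangeability of the tuple $(W, W')$ and $\|W - W'\|_2 \le K$, (12) yields

$$\|\nabla m(\theta)\|_2 \le \frac{\nu_1\|\theta\|_2}{4}E\left(\|W - W'\|_2^2\,(e^{\theta^t W} + e^{\theta^t W'})\right) \le \frac{\nu_1K^2\|\theta\|_2}{2}\,m(\theta). \tag{13}$$

Since

$$\nabla(\log(m(\theta))) = \frac{\nabla m(\theta)}{m(\theta)},$$

we obtain, using (13),

$$\|\nabla(\log(m(\theta)))\|_2 \le \frac{\nu_1K^2\|\theta\|_2}{2}. \tag{14}$$

Hence, using $m(0) = 1$ and the mean value theorem on $\log(m(\theta))$, we have

$$\log(m(\theta)) = \nabla(\log(m(z))) \cdot \theta \le \|\nabla(\log(m(z)))\|_2\|\theta\|_2, \tag{15}$$

where $z \in \mathbb{R}^k$ is a vector on the line segment joining $0$ to $\theta$. Since (14) holds for arbitrary $\theta \in \mathbb{R}^k$, and for $z$ in particular, (15) yields

$$\log(m(\theta)) \le \frac{\nu_1K^2\|z\|_2}{2}\|\theta\|_2 \le \frac{\nu_1K^2\|\theta\|_2^2}{2}.$$

Hence

$$m(\theta) \le \exp\left(\frac{K^2\nu_1\|\theta\|_2^2}{2}\right). \tag{16}$$

Hence, for arbitrary fixed $w \succeq 0$ and any $\theta \succeq 0$,

$$P(W \succeq w) \le P(\theta^tW \ge \theta^tw) \le e^{-\theta^tw}m(\theta). \tag{17}$$

Combining (16) and (17), we obtain

$$P(W \succeq w) \le \exp\left(-\theta^tw + \frac{K^2\nu_1\|\theta\|_2^2}{2}\right) = \prod_{i=1}^k \exp\left(-\theta_iw_i + \frac{K^2\nu_1\theta_i^2}{2}\right). \tag{18}$$

We can minimize each term in the product on the right hand side of (18) individually. Using $\theta_i = w_i/(K^2\nu_1)$ in (18), we obtain

$$P(W \succeq w) \le \exp\left(-\frac{\|w\|_2^2}{2\nu_1K^2}\right).$$

The other inequality, for $P(W \preceq -w)$, is derived similarly by considering $\theta \preceq 0$.

Coming to the inequalities for the individual coordinates, take $\theta = (0, \ldots, \theta_i, \ldots, 0)$, that is, zero in all coordinates except the $i$th one. Then we obtain

$$P(W_i \ge w_i) \le e^{-\theta_iw_i}E(e^{\theta_iW_i}) = e^{-\theta_iw_i}m(\theta) \le e^{-\theta_iw_i}\exp\left(\frac{K^2\nu_1\theta_i^2}{2}\right).$$

Letting $\theta_i = w_i/(K^2\nu_1)$ as before yields (5). The left tail bound is similar. □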
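The final optimization step of the proof can be illustrated numerically. The sketch below assumes the moment generating function bound $m(\theta) \le \exp(c\|\theta\|_2^2/2)$ with $c = \nu_1K^2$, and checks that the choice $\theta = w/c$ attains the exponent $-\|w\|_2^2/(2c)$. The values of `nu1`, `K` and `w` are hypothetical and serve only as an illustration.

```python
def chernoff_exponent(theta, w, c):
    """Exponent -theta.w + c * ||theta||^2 / 2 of the bound e^{-theta.w} m(theta)."""
    dot = sum(t * x for t, x in zip(theta, w))
    sq_norm = sum(t * t for t in theta)
    return -dot + c * sq_norm / 2.0

def optimal_theta(w, c):
    """Coordinatewise minimizer theta_i = w_i / c used in the proof."""
    return [x / c for x in w]

# Hypothetical constants, chosen only for illustration.
nu1, K = 2.0, 0.5
c = nu1 * K * K
w = [1.0, 0.5, 2.0]

theta_star = optimal_theta(w, c)
best = chernoff_exponent(theta_star, w, c)
closed_form = -sum(x * x for x in w) / (2.0 * c)

# Perturbing the minimizer can only increase the exponent.
perturbed = chernoff_exponent([t + 0.1 for t in theta_star], w, c)
```

Since the exponent is a separable convex quadratic in $\theta$, the coordinatewise minimizer is also the global one, which is why the proof can treat each factor of the product separately.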



3 An application from U-statistics 

Let $X = (X_1, X_2, \ldots, X_n)$ be a vector of i.i.d. random variables, and let $\psi : \mathbb{R}^d \to \mathbb{R}$ be a measurable and symmetric function with $E\psi(X_1, X_2, \ldots, X_d) = 0$. The complete nonstandardized U-statistic of degree $d$ corresponding to the kernel function $\psi$ is given by

$$U_d(X) = \sum_{1 \le j_1 < j_2 < \cdots < j_d \le n} \psi(X_{j_1}, X_{j_2}, \ldots, X_{j_d}).$$

For $1 \le k \le d$, we define, following the notation in [15],

$$\psi_k(X_1, X_2, \ldots, X_k) = E\left(\psi(X_1, X_2, \ldots, X_k, X_{k+1}, X_{k+2}, \ldots, X_d) \mid X_1, X_2, \ldots, X_k\right).$$

If $j = \{j_1, j_2, \ldots, j_k\} \subset \{1, 2, \ldots, n\}$, we define

$$\psi_k(j) = \psi_k(X_{j_1}, X_{j_2}, \ldots, X_{j_k})$$

and the corresponding nonstandardized U-statistic is defined by

$$U_k(X) = \sum_{|j| = k} \psi_k(j).$$

Clearly, $U_d(X)$ is the complete nonstandardized U-statistic corresponding to $\psi$. U-statistics were introduced in [11] and arise naturally in nonparametric statistics. Rinott and Rotar [15] used Stein's method of exchangeable pairs to obtain Kolmogorov distance bounds to the normal distribution for weighted U-statistics. In [10] and [1], concentration of measure results were obtained. While the results in [10] apply to U-statistics of order two only, the results in [1] are very general, although they apply to degenerate U-statistics only, that is, the case when $P(\psi_1(X_1) = 0) = 1$. In the present section, we obtain concentration of measure results for nondegenerate U-statistics, and thus work with the assumption $P(\psi_1(X_1) = 0) < 1$ henceforth. We also work with the further restriction $\|\psi\|_\infty \le b$.

Let us consider the following standardized U-statistics for $i = 1, \ldots, d$:

$$W_i = n^{1/2}\binom{n}{i}^{-1}U_i(X).$$

It has been shown in [11] that $\operatorname{var} W_i \asymp 1$, and furthermore in [15] it was shown that we can embed $W_d$ in a vector $W$ so that (3) holds. An application of Theorem 2.1 then yields the following result.

Theorem 3.1. Let $X_1, X_2, \ldots, X_n$ be a collection of i.i.d. variables. Suppose $\psi : \mathbb{R}^d \to \mathbb{R}$ is a symmetric, measurable function so that $\|\psi\|_\infty \le b$. Assume $E\psi(X_1, X_2, \ldots, X_d) = 0$ and $P(E(\psi(X_1, X_2, \ldots, X_d) \mid X_1) = 0) < 1$. If $W_d$ denotes the U-statistic

$$W_d = n^{1/2}\binom{n}{d}^{-1}\sum_{|j| = d}\psi(j),$$

then $W_d$ satisfies

$$P(W_d \ge t),\ P(W_d \le -t) \le \exp\left(-\frac{\kappa_d^{1/2}t^2}{8b^2\gamma_d^2}\right),$$

where

$$\gamma_d = \left(\frac{d(d+1)(2d+1)}{6}\right)^{1/2} \quad \text{and} \quad \kappa_d = \frac{(d!)^2\,3^{d-1}}{(d(d+1)(2d+1))^{d-1}}.$$

Proof. Let $X_1', X_2', \ldots, X_n'$ be $n$ independent copies of $X_1, \ldots, X_n$. Suppose $X^i = (X_1, X_2, \ldots, X_{i-1}, X_i', X_{i+1}, \ldots, X_n)$, that is, we substitute the $i$th coordinate with an independent copy of $X_i$. Define

$$\psi_k^i(j) = \psi_k(X^i_{j_1}, X^i_{j_2}, \ldots, X^i_{j_k}),$$

that is, $\psi_k$ applied to the sample with the $i$th coordinate exchanged. Pick an index $I$ uniformly at random from $\{1, 2, \ldots, n\}$ and consider the U-statistics defined as

$$U_k' = \sum_{|j| = k}\psi_k^I(j) \quad \text{and} \quad W_k' = n^{1/2}\binom{n}{k}^{-1}U_k'.$$

It is clear that $(W_d, W_d')$ is an exchangeable pair, although it does not satisfy the univariate linearity condition. It has been shown in [15] that with $W = (W_1, W_2, \ldots, W_d)$ and $W' = (W_1', W_2', \ldots, W_d')$, the multivariate Stein condition (3) holds with the lower triangular matrix

$$\Lambda = \frac{1}{n}\begin{pmatrix} 1 & & & & \\ -2 & 2 & & & \\ & -3 & 3 & & \\ & & \ddots & \ddots & \\ & & & -d & d \end{pmatrix}. \tag{19}$$

Clearly $\psi_k^I(j) = \psi_k(j)$ if $I \notin j$. Since $\|\psi\|_\infty \le b$, we therefore obtain

$$|\psi_k(j) - \psi_k^I(j)| \le 2b\,1(I \in j). \tag{20}$$

Using (20) and $|\{j : j \ni I\}| = \binom{n-1}{k-1}$, we have

$$|U_k - U_k'| \le \sum_{|j| = k}|\psi_k(j) - \psi_k^I(j)| \le 2b\binom{n-1}{k-1}. \tag{21}$$

Hence we obtain

$$|W_k - W_k'| = n^{1/2}\binom{n}{k}^{-1}|U_k - U_k'| \le 2b\,n^{1/2}\binom{n}{k}^{-1}\binom{n-1}{k-1} = 2bk\,n^{-1/2}.$$

The bound above readily yields

$$\|W - W'\|_2 \le 2b\,n^{-1/2}\left(\frac{d(d+1)(2d+1)}{6}\right)^{1/2} = 2b\,n^{-1/2}\gamma_d. \tag{22}$$

Using (22) and (19), we can apply Theorem 2.1 with $K = 2b\gamma_dn^{-1/2}$. Next, we have to obtain lower bounds on the singular values of $\Lambda$ as in (19), following Remark 2.1. It is easy to see that

$$\operatorname{trace}(\Lambda^t\Lambda) = \frac{1}{n^2}\left(\frac{d(d+1)(2d+1)}{3} - 1\right) \le \frac{d(d+1)(2d+1)}{3n^2}. \tag{23}$$

Also,

$$\det(\Lambda^t\Lambda) = \det(\Lambda)^2 = (d!)^2n^{-2d}. \tag{24}$$

Suppose $0 < \sigma_1 \le \sigma_2 \le \cdots \le \sigma_d$ denote the $d$ singular values of $\Lambda$ in order. Using (6), (23) and (24) we obtain

$$\sigma_1^2(\Lambda) \ge \frac{(d!)^2(3n^2)^{d-1}}{n^{2d}(d(d+1)(2d+1))^{d-1}} = \kappa_dn^{-2}. \tag{25}$$

Hence, with $\nu_1 = 1/\sigma_1(\Lambda)$, we obtain $\nu_1 \le \kappa_d^{-1/2}n$. Thus, using Theorem 2.1 we obtain our result. □
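The determinant and trace computations for the lower triangular matrix of this proof can be checked with exact rational arithmetic. The sketch below assumes the matrix has $i/n$ on the diagonal of row $i$ and $-i/n$ immediately to its left, and that $\kappa_d = (d!)^2 3^{d-1}/(d(d+1)(2d+1))^{d-1}$; both are readings of the garbled source, so treat the code as a consistency check under those assumptions. The values `d = 3` and `n = 10` are arbitrary.

```python
from fractions import Fraction
from math import factorial

def u_stat_lambda(d, n):
    """Lower triangular matrix: row i has i/n on the diagonal, -i/n to its left."""
    lam = [[Fraction(0)] * d for _ in range(d)]
    for i in range(1, d + 1):
        lam[i - 1][i - 1] = Fraction(i, n)
        if i >= 2:
            lam[i - 1][i - 2] = Fraction(-i, n)
    return lam

def trace_gram(lam):
    """trace(Lambda^t Lambda) equals the sum of squared entries."""
    return sum(x * x for row in lam for x in row)

def det_triangular(lam):
    """Determinant of a triangular matrix: product of diagonal entries."""
    prod = Fraction(1)
    for i in range(len(lam)):
        prod *= lam[i][i]
    return prod

d, n = 3, 10  # arbitrary illustrative values
lam = u_stat_lambda(d, n)
det_gram = det_triangular(lam) ** 2
tr = trace_gram(lam)
kappa_d = Fraction(factorial(d) ** 2 * 3 ** (d - 1),
                   (d * (d + 1) * (2 * d + 1)) ** (d - 1))
sigma1_sq_lower = det_gram / tr ** (d - 1)
```

With these values the determinant/trace ratio is at least $\kappa_d n^{-2}$, which is the inequality the proof needs to conclude $\nu_1 \le \kappa_d^{-1/2}n$.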



4 Doubly indexed permutation statistics 

Let $A = \{a_{i,j,k,l} : 1 \le i, j, k, l \le n\}$ be a collection of real numbers such that $a_{i,j,k,l} = 0$ whenever $i = j$ or $k = l$, $a_{i,j,k,l} = a_{i,j,l,k} = a_{j,i,l,k}$ and $\sum_{i,j,k,l} a_{i,j,k,l} = 0$. We consider the doubly indexed permutation statistic

$$V_1 = \sum_{1 \le s \ne t \le n} a_{s,t,\pi(s),\pi(t)},$$

where $\pi$ is a permutation chosen uniformly from $S_n$, the symmetric group of order $n$. For notational simplicity, we will borrow the notation $a_{i,j,\pi(k),\pi(l)} = a^\pi_{i,j,k,l}$ from [14], so that

$$V_1 = \sum_{1 \le s \ne t \le n} a^\pi_{s,t,s,t}. \tag{26}$$

These statistics are natural in several nonparametric hypothesis testing problems in statistics. Examples are the Mann-Whitney-Wilcoxon rank statistic [12], which tests for the equality of distributions of two sets of data, and the multivariate graph correlation statistic due to Friedman and Rafsky [4], [5], which tests whether there is significant correlation present between two sets of multivariate vectors. In these cases one is typically interested in obtaining the p-values for $V_1$ under the null distribution.

In [14] and [18], the authors obtained bounds for the error in normal approximation of $V_1$ using exchangeable pairs and Stein's method. We will use the exchangeable pair obtained in [14] to prove the following theorem.

Theorem 4.1. Let $a_{i,j,k,l}$, $1 \le i, j, k, l \le n$, be a collection of real numbers so that $a_{i,j,k,l} = 0$ if $i = j$ or $k = l$, $\sum_{i,j,k,l} a_{i,j,k,l} = 0$ and $a_{i,j,k,l} = a_{i,j,l,k} = a_{j,i,l,k}$ for all $i, j, k, l$. If $\sup_{i,j,k,l}|a_{i,j,k,l}| \le b$, then with $V_1$ as in (26), $W_1 = n^{-3/2}V_1$ satisfies the following concentration inequality for all $t > 0$,

$$P(W_1 \le -t),\ P(W_1 \ge t) \le \exp\left(-\frac{t^2}{2\phi_{b,n}}\right), \tag{27}$$

where $\phi_{b,n} = (8(2n - 1)b^2(6 + 4/n + 1/n^2))/n$.
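The statistic $V_1$ and the constant $\phi_{b,n}$ of Theorem 4.1 are straightforward to evaluate. The sketch below uses a hypothetical toy array of product form $a_{i,j,k,l} = u_{i,j}v_{k,l}$ with symmetric $u$, $v$, zero diagonals and $\sum u_{i,j} = 0$, so that the hypotheses of the theorem hold with $b = 2$. The array, the 0-based indexing and the function names are assumptions made for illustration.

```python
import math

def doubly_indexed_statistic(a, perm):
    """V_1 = sum over s != t of a[s][t][perm[s]][perm[t]] (0-based indices)."""
    n = len(perm)
    return sum(a[s][t][perm[s]][perm[t]]
               for s in range(n) for t in range(n) if s != t)

def phi(b, n):
    """phi_{b,n} = 8 (2n - 1) b^2 (6 + 4/n + 1/n^2) / n."""
    return 8.0 * (2 * n - 1) * b * b * (6.0 + 4.0 / n + 1.0 / n ** 2) / n

def theorem41_bound(t, b, n):
    """Tail bound exp(-t^2 / (2 phi_{b,n})) for W_1 = n^{-3/2} V_1."""
    return math.exp(-t * t / (2.0 * phi(b, n)))

# Hypothetical toy array for n = 3: a[i][j][k][l] = u[i][j] * v[k][l].
u = [[0, 1, -1], [1, 0, 0], [-1, 0, 0]]   # symmetric, zero diagonal, sums to 0
v = [[0, 2, 0], [2, 0, 1], [0, 1, 0]]     # symmetric, zero diagonal
a = [[[[u[i][j] * v[k][l] for l in range(3)] for k in range(3)]
      for j in range(3)] for i in range(3)]

v1_identity = doubly_indexed_statistic(a, [0, 1, 2])
v1_swapped = doubly_indexed_statistic(a, [1, 0, 2])
```

For $b = 1/2$ the bound reduces to $\exp(-t^2n/(4(2n-1)(6 + 4/n + 1/n^2)))$, which is the form that appears in the Mann-Whitney-Wilcoxon application.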

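Proof. We will first construct an exchangeable pair $(V_1, V_1')$, and equivalently $(W_1, W_1')$ where $W_1' = n^{-3/2}V_1'$, and then construct the pair $(W, W')$ satisfying (2). Suppose $\tau_{i,j}$ denotes the transposition of $i, j$, that is

$$\tau_{i,j}(k) = k \ \text{ for all } k \ne i, j \quad \text{and} \quad \tau_{i,j}(i) = j,\ \tau_{i,j}(j) = i.$$

To construct the exchangeable pair, we select two distinct indices $I, J$ uniformly from $\{1, 2, \ldots, n\}$. Letting $\pi' = \pi\tau_{I,J}$, we denote

$$V_1' = \sum_{s,t=1}^n a^{\pi'}_{s,t,s,t}.$$

Let $V = (V_1, V_2, V_3)$ and $V' = (V_1', V_2', V_3')$, where

$$V_i = \sum_{s=1}^n a^{(i)}_{s,\pi(s)} \quad \text{and} \quad V_i' = \sum_{s=1}^n a^{(i)}_{s,\pi'(s)} \quad \text{for } i = 2, 3,$$

where

$$a^{(2)}_{i,k} = \frac{1}{n}\sum_{j,l} a_{i,j,k,l}, \qquad a^{(3)}_{j,l} = \frac{1}{n}\sum_{i,k} a_{i,j,k,l} = a^{(2)}_{j,l}.$$

The last equality above implies $V_2 = V_3$ and $V_2' = V_3'$. It has been shown in [14] that the tuple $(V, V')$ satisfies

$$E(V'|V) = (I_3 - \Lambda)V + R', \tag{28}$$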

where $R' = (R_1, 0, 0)$, with

$$R_1 = -\frac{2}{n(n-1)}\sum_{s \ne t} a^\pi_{s,t,s,t} = -\frac{2}{n(n-1)}V_1, \tag{29}$$

and

$$\Lambda = \frac{2}{n-1}\begin{pmatrix} \frac{2n-1}{n} & -1 & -1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{30}$$

Using (28), we obtain that $(W, W') = n^{-3/2}(V, V')$ satisfies

$$E(W'|W) = (I_3 - \Lambda)W + R, \tag{31}$$

where $\Lambda$ is as in (30) and $R = n^{-3/2}R'$.

Next we bound $\|W - W'\|_2$ and $\nu_1 = \sigma_1^{-1}(\Lambda)$. First we bound $\|W - W'\|_2$. It is easy to verify that

n 

Vi -Vl = - ^( a /, S ,/,S + a7 J,S,J,S + "ll.sj + a S,J,S,j) 

s=l 

■ 1. 1 + "i.j.,. + "j.j.j.j + "l,..i.r- 

n 

+ 5Z( a /,s : J,s + a J,sJ,s + a s,J,s,J + a s,J,s,j) 
s=l 

— ( a 7,/,J,J + a 7,J,J,/ + a J,I,I,.J + a J.J,I,l) 
n 

= - 5I( a /,sJ : s + a ls,J,s + a s,I,sJ + a s,J,s, j) 



s=l 



s=l 



and also 



+( a J,J,i,j + a J,/,Jj) - ( a 7,J,J,/ + a J,/,/,j). ( 32 ) 

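$$V_3' - V_3 = V_2' - V_2 = -a^{(2)}_{I,\pi(I)} - a^{(2)}_{J,\pi(J)} + a^{(2)}_{I,\pi(J)} + a^{(2)}_{J,\pi(I)}. \tag{33}$$

The equalities in (32) and (33), along with the facts that $|a_{i,j,k,l}| \le b$ and $|a^{(i)}_{s,t}| \le bn$ for all $1 \le s, t \le n$, give

$$|V_1' - V_1| \le 8bn + 4b \quad \text{and} \quad |V_i' - V_i| \le 4bn \ \text{ for } i = 2, 3.$$

Thus we obtain

$$\|V - V'\|_2 \le \left((8bn + 4b)^2 + 32b^2n^2\right)^{1/2} = 4b(6n^2 + 4n + 1)^{1/2}.$$

Since $(W, W') = n^{-3/2}(V, V')$, we obtain

$$\|W - W'\|_2 \le 4bn^{-3/2}(6n^2 + 4n + 1)^{1/2} = 4bn^{-1/2}(6 + 4/n + 1/n^2)^{1/2} := \eta_{b,n}, \text{ say.} \tag{34}$$

Next, we need to bound $\nu_1$. As in Remark 2.1, we first obtain $\det(\Lambda^t\Lambda)$ and $\operatorname{trace}(\Lambda^t\Lambda)$:

$$\det(\Lambda^t\Lambda) = \det{}^2(\Lambda) = \left(\frac{8(2n-1)}{(n-1)^3n}\right)^2 \quad \text{and} \quad \operatorname{trace}(\Lambda^t\Lambda) = \left(\frac{2}{n-1}\right)^2\left(\frac{(2n-1)^2}{n^2} + 4\right) \le \frac{32}{(n-1)^2}.$$

Using Remark 2.1 we obtain

$$\sigma_1^2(\Lambda) \ge \frac{\det(\Lambda^t\Lambda)}{\operatorname{trace}^2(\Lambda^t\Lambda)} \ge \left(\frac{2n-1}{4n(n-1)}\right)^2.$$

Hence, with $\nu_1 = \sigma_1^{-1}(\Lambda)$ as in Theorem 2.1, we obtain

$$\nu_1 \le \frac{4n(n-1)}{2n-1} \le 2n. \tag{35}$$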

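As in the proof of Theorem 2.1, we consider $m(\theta) = E(e^{\theta^tW})$ for $\theta \in \mathbb{R}^3$. The gradient vector is given by

$$\nabla m(\theta) = \left(\frac{\partial m(\theta)}{\partial\theta_1}, \frac{\partial m(\theta)}{\partial\theta_2}, \frac{\partial m(\theta)}{\partial\theta_3}\right)^t = E(We^{\theta^tW}).$$

Using (31), we obtain

$$\nabla m(\theta) = E(We^{\theta^tW}) = E((W - W')e^{\theta^tW}) + E(W'e^{\theta^tW})$$
$$= E((W - W')e^{\theta^tW}) + E(E(W'|W)e^{\theta^tW})$$
$$= E((W - W')e^{\theta^tW}) + (I_3 - \Lambda)E(We^{\theta^tW}) + E(Re^{\theta^tW}).$$

Changing sides we obtain

$$\Lambda E(We^{\theta^tW}) = E((W - W')e^{\theta^tW}) + E(Re^{\theta^tW}). \tag{36}$$

Since $(W, W')$ is exchangeable, we have

$$E((W - W')e^{\theta^tW}) = E((W' - W)e^{\theta^tW'}) = -E((W - W')e^{\theta^tW'}),$$

implying

$$E((W - W')e^{\theta^tW}) = \frac{1}{2}E\left((W - W')(e^{\theta^tW} - e^{\theta^tW'})\right). \tag{37}$$

Using (36) and (37), we obtain

$$\Lambda E(We^{\theta^tW}) = \frac{1}{2}E\left((W - W')(e^{\theta^tW} - e^{\theta^tW'})\right) + E(Re^{\theta^tW}). \tag{38}$$

Premultiplying both sides by $\Lambda^{-1}$, we have

$$E(We^{\theta^tW}) = \frac{1}{2}E\left(\Lambda^{-1}(W - W')(e^{\theta^tW} - e^{\theta^tW'})\right) + E(\Lambda^{-1}Re^{\theta^tW}). \tag{39}$$

Equating the first coordinates of the vectors on the two sides of (39), we obtain

$$E(W_1e^{\theta^tW}) = \frac{1}{2}E\left([\Lambda^{-1}(W - W')]_1(e^{\theta^tW} - e^{\theta^tW'})\right) + E\left([\Lambda^{-1}R]_1e^{\theta^tW}\right), \tag{40}$$

where for a vector $X$, $[X]_1 := X_1$ is the first coordinate. Since

$$R = -\frac{2}{n(n-1)}(W_1, 0, 0)^t \quad \text{and} \quad [\Lambda^{-1}]_{11} = \frac{n(n-1)}{2(2n-1)},$$

we obtain

$$[\Lambda^{-1}R]_1 = -\frac{2W_1}{n(n-1)}[\Lambda^{-1}]_{11} = -\frac{W_1}{2n-1}.$$

Thus, (40) now yields

$$E(W_1e^{\theta^tW}) = \frac{1}{2}E\left([\Lambda^{-1}(W - W')]_1(e^{\theta^tW} - e^{\theta^tW'})\right) - \frac{1}{2n-1}E(W_1e^{\theta^tW}).$$

Changing sides, we obtain

$$\frac{2n}{2n-1}E(W_1e^{\theta^tW}) = \frac{1}{2}E\left([\Lambda^{-1}(W - W')]_1(e^{\theta^tW} - e^{\theta^tW'})\right). \tag{41}$$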

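As before, note that $\|\Lambda^{-1}\|_2 = \nu_1$. Taking absolute values on both sides of (41) and using (34) and Jensen's inequality, we obtain

$$|E(W_1e^{\theta^tW})| = \frac{2n-1}{4n}\left|E\left([\Lambda^{-1}(W - W')]_1(e^{\theta^tW} - e^{\theta^tW'})\right)\right|$$
$$\le \frac{2n-1}{4n}E\left(\|\Lambda^{-1}(W - W')\|_2\,|e^{\theta^tW} - e^{\theta^tW'}|\right)$$
$$\le \frac{2n-1}{4n}E\left(\|\Lambda^{-1}\|_2\|W - W'\|_2\,|e^{\theta^tW} - e^{\theta^tW'}|\right)$$
$$\le \frac{(2n-1)\nu_1\eta_{b,n}}{4n}E\left|e^{\theta^tW} - e^{\theta^tW'}\right|. \tag{42}$$

Taking $\theta = (\theta_1, 0, 0)^t$ and using (7), we obtain

$$|E(W_1e^{\theta_1W_1})| \le \frac{(2n-1)\nu_1\eta_{b,n}}{4n}E\left|e^{\theta_1W_1} - e^{\theta_1W_1'}\right|$$
$$\le \frac{(2n-1)\nu_1\eta_{b,n}}{4n}E\left(\frac{|\theta_1(W_1 - W_1')|}{2}\left(e^{\theta_1W_1} + e^{\theta_1W_1'}\right)\right)$$
$$\le \frac{(2n-1)\nu_1\eta_{b,n}^2|\theta_1|}{4n}\cdot\frac{E(e^{\theta_1W_1}) + E(e^{\theta_1W_1'})}{2} = \frac{(2n-1)\nu_1\eta_{b,n}^2|\theta_1|}{4n}E(e^{\theta_1W_1}).$$

Using (34), we obtain

$$|E(W_1e^{\theta_1W_1})| \le \frac{4(2n-1)b^2(6 + 4/n + 1/n^2)\nu_1|\theta_1|}{n^2}E(e^{\theta_1W_1}).$$

The bound from (35) yields

$$|E(W_1e^{\theta_1W_1})| \le \frac{8(2n-1)b^2(6 + 4/n + 1/n^2)|\theta_1|}{n}E(e^{\theta_1W_1}).$$

Hence, with $m_1(\theta_1) = E(e^{\theta_1W_1})$, we obtain

$$|m_1'(\theta_1)| = |E(W_1e^{\theta_1W_1})| \le \frac{8(2n-1)b^2(6 + 4/n + 1/n^2)|\theta_1|}{n}m_1(\theta_1) = \phi_{b,n}|\theta_1|m_1(\theta_1). \tag{43}$$

It is easy to see that

$$E(V_1) = \frac{1}{n(n-1)}\sum_{i,j,k,l} a_{i,j,k,l} = 0,$$

implying $m_1'(0) = E(W_1) = 0$ as well. Since $m_1(\theta_1)$ is a convex function, we therefore have $m_1'(\theta_1) \ge 0$ for $\theta_1 \ge 0$ and $m_1'(\theta_1) \le 0$ for $\theta_1 \le 0$.

Using (43), we therefore have, for $\theta_1 \ge 0$,

$$m_1'(\theta_1) \le \phi_{b,n}\theta_1m_1(\theta_1),$$

which on integration yields

$$\log(m_1(\theta_1)) \le \frac{\phi_{b,n}\theta_1^2}{2} \quad \text{for } \theta_1 \ge 0.$$

A similar argument holds for $\theta_1 < 0$ as well, yielding

$$m_1(\theta_1) \le \exp\left(\frac{\phi_{b,n}\theta_1^2}{2}\right) \quad \text{for all } \theta_1.$$

Using Markov's inequality, we have

$$P(W_1 \ge t) \le e^{-\theta_1t}m_1(\theta_1) \le \exp\left(-\theta_1t + \frac{\phi_{b,n}\theta_1^2}{2}\right) \quad \text{for all } \theta_1 \ge 0.$$

Using $\theta_1 = t/\phi_{b,n}$, we obtain

$$P(W_1 \ge t) \le \exp\left(-\frac{t^2}{2\phi_{b,n}}\right).$$

The bound for $P(W_1 \le -t)$ is similar. □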



Next we discuss two applications of Theorem 4.1 to distribution free hypothesis testing. The first is the Mann-Whitney-Wilcoxon rank statistic, while the second is the generalized multivariate correlation measure due to Friedman and Rafsky.

4.1 Applications to Mann-Whitney-Wilcoxon statistic

Let $x_1, x_2, \ldots, x_{n_1}$ and $y_1, y_2, \ldots, y_{n_2}$, $n_1 + n_2 = n$, be independent univariate samples from unknown continuous distributions $F_X$ and $F_Y$ respectively. One is interested in testing the hypothesis

$$H_0 : F_X = F_Y \quad \text{vs.} \quad H_1 : F_X \ne F_Y.$$

The MWW test statistic is defined as

$$V_{MWW} = |\{(i, j) : x_i < y_j\}|. \tag{44}$$

We reject $H_0$ if $V_{MWW}$ is too large or too small, see [12]. The rate of convergence to normality for $V_{MWW}$ was considered in [15] and [14]. Let $z = (x_1, x_2, \ldots, x_{n_1}, y_1, y_2, \ldots, y_{n_2})$ and let $\pi(i)$ denote the rank of $z_i$. Under $H_0$, $\pi$ is clearly a uniform random permutation. For $1 \le i, j, k, l \le n$, define

$$a_{i,j,k,l} = \begin{cases} +\frac{1}{2} & \text{if } 1 \le i \le n_1,\ n_1 + 1 \le j \le n \text{ and } 1 \le k < l \le n, \\ -\frac{1}{2} & \text{if } 1 \le i \le n_1,\ n_1 + 1 \le j \le n \text{ and } 1 \le l < k \le n, \\ 0 & \text{otherwise.} \end{cases} \tag{45}$$

Since

$$V_1 = \sum_{s \ne t} a^\pi_{s,t,s,t} = \frac{1}{2}\sum_{1 \le s \le n_1,\ n_1+1 \le t \le n}\left(1(x_s < y_{t-n_1}) - 1(x_s > y_{t-n_1})\right)$$
$$= \frac{1}{2}\left(V_{MWW} - (n_1n_2 - V_{MWW})\right) = V_{MWW} - \frac{n_1n_2}{2},$$

and $\sum_{i,j,k,l} a_{i,j,k,l} = 0$, we obtain that $V_1$ is $V_{MWW}$ mean centered. Hence, instead of evaluating the p-values of $V_{MWW}$ under $H_0$, we might as well obtain the same for $V_1$. Since $a_{i,j,k,l}$ in (45) satisfies the hypothesis of Theorem 4.1, we can apply Theorem 4.1 to bound the p-values of $V_1$. In particular, using $b = 1/2$ in Theorem 4.1 we obtain the following proposition.

Proposition 4.1. Let $x_1, x_2, \ldots, x_{n_1}$ and $y_1, y_2, \ldots, y_{n_2}$, $n_1 + n_2 = n$, be independent univariate samples from unknown continuous distributions $F_X$ and $F_Y$. Let $a_{i,j,k,l}$ be defined as in (45). If $\pi$ is a permutation chosen uniformly at random and

$$V_1 = \sum_{s \ne t} a^\pi_{s,t,s,t},$$

then $W_1 = n^{-3/2}V_1$ satisfies the following inequality for all $t > 0$:

$$P(W_1 \ge t),\ P(W_1 \le -t) \le \exp\left(-\frac{t^2n}{4(2n-1)(6 + 4/n + 1/n^2)}\right).$$
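The identity $V_1 = V_{MWW} - n_1n_2/2$ and the bound of Proposition 4.1 can be illustrated on a small data set. The sketch below is a minimal illustration: the two samples are hypothetical, ties are assumed absent (as for continuous distributions), and the function names are not from the paper.

```python
import math

def v_mww(xs, ys):
    """V_MWW: number of pairs (i, j) with x_i < y_j."""
    return sum(1 for x in xs for y in ys if x < y)

def mww_array_entry(i, j, k, l, n1, n):
    """Entry a_{i,j,k,l} of the MWW array, with 1-based indices."""
    if 1 <= i <= n1 and n1 + 1 <= j <= n:
        if k < l:
            return 0.5
        if l < k:
            return -0.5
    return 0.0

def v1_from_ranks(xs, ys):
    """V_1 = sum over s != t of a_{s,t,pi(s),pi(t)}, pi = ranks of the pooled sample."""
    z = list(xs) + list(ys)
    n1, n = len(xs), len(z)
    order = sorted(range(n), key=lambda idx: z[idx])
    rank = [0] * n
    for r, idx in enumerate(order, start=1):
        rank[idx] = r
    return sum(mww_array_entry(s, t, rank[s - 1], rank[t - 1], n1, n)
               for s in range(1, n + 1) for t in range(1, n + 1) if s != t)

def prop41_bound(t, n):
    """Tail bound exp(-t^2 n / (4 (2n - 1) (6 + 4/n + 1/n^2)))."""
    return math.exp(-t * t * n / (4.0 * (2 * n - 1) * (6.0 + 4.0 / n + 1.0 / n ** 2)))

# Hypothetical samples without ties.
xs = [0.3, 1.7, 2.2]
ys = [0.9, 1.1, 2.5, 3.0]
n = len(xs) + len(ys)
v1 = v1_from_ranks(xs, ys)
w1 = v1 / n ** 1.5
bound_at_w1 = prop41_bound(abs(w1), n)
```

Evaluating `prop41_bound` at the observed $|W_1|$ gives an upper bound on each one-sided tail probability under $H_0$, which can be used as a conservative p-value.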



4.2 Random intersection of interpoint distance based graphs 

In [4] and [5], notions of association measures like Kendall's $\tau$ were extended to multivariate observations using interpoint distance based graphs. Let $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ be $n$ i.i.d. vector tuples. We are interested in examining the strength of association between $X$ and $Y$. This is achieved by constructing $k$ minimal spanning trees or $k$ nearest neighbour spanning subgraphs $G_1$ and $G_2$ out of the $X$ and $Y$ datapoints respectively. If $E_i$ denotes the edge set of $G_i$ for $i = 1, 2$, then the statistic of interest is

$$T_1 = \sum_{1 \le i, j \le n} 1((i, j) \in E_1)\,1((i, j) \in E_2).$$

Clearly, a large value of $T_1$ indicates the presence of significant association between $X$ and $Y$. For notational simplicity, let $a_{i,j,k,l} = c_{i,j}d_{k,l}$, where $c_{i,j} = 1((i, j) \in E_1)$ and $d_{k,l} = 1((k, l) \in E_2)$. We need to compare the observed value of $T_1$ against the baseline p-values of $V_1$, where

$$V_1 = \sum_{s,t} a^\pi_{s,t,s,t} = \sum_{s,t} 1((s, t) \in G_1)\,1((\pi(s), \pi(t)) \in G_2), \tag{46}$$

where $\pi$ is a permutation chosen uniformly at random from $S_n$. Clearly

$$\mu = E(V_1) = \frac{1}{n(n-1)}\sum_{i,j,k,l} a_{i,j,k,l} = \frac{4|E_1||E_2|}{n(n-1)}.$$

Hence, if we consider

$$\hat{a}_{i,j,k,l} = \begin{cases} a_{i,j,k,l} - \frac{\mu}{n(n-1)} & \text{if } i \ne j \text{ and } k \ne l, \\ 0 & \text{otherwise,} \end{cases}$$

then $\sum_{i,j,k,l} \hat{a}_{i,j,k,l} = 0$ and the array $\hat{a}_{i,j,k,l}$, $1 \le i, j, k, l \le n$, satisfies the conditions in Theorem 4.1. Since $|E_1|, |E_2| \le n(n-1)/2$, the number of edges in the complete graph on $n$ vertices, we obtain

$$|\hat{a}_{i,j,k,l}| \le a_{i,j,k,l} + \frac{4|E_1||E_2|}{n^2(n-1)^2} \le 2.$$

Hence, applying Theorem 4.1 with $b = 2$, we obtain the following proposition.

Proposition 4.2. Let $G_1 = (\mathcal{V}_1, E_1)$, $G_2 = (\mathcal{V}_2, E_2)$ be two interpoint distance based graphs derived from $n$ data points $X_1, X_2, \ldots, X_n$ and $Y_1, Y_2, \ldots, Y_n$ respectively. Let $\pi$ be a permutation chosen uniformly at random from $S_n$. Then $V_1$, as defined in (46), satisfies the following concentration inequality for all $t > 0$:

$$P\left(n^{-3/2}\left(V_1 - \frac{4|E_1||E_2|}{n(n-1)}\right) \ge t\right),\ P\left(n^{-3/2}\left(V_1 - \frac{4|E_1||E_2|}{n(n-1)}\right) \le -t\right) \le \exp\left(-\frac{t^2n}{64(2n-1)(6 + 4/n + 1/n^2)}\right).$$
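The baseline mean $\mu = 4|E_1||E_2|/(n(n-1))$ of the permuted intersection statistic can be verified exhaustively on a tiny example. In the sketch below the two edge lists are hypothetical, vertices are labelled from 0, and each undirected edge is counted in both orientations, which is the convention that produces the factor 4.

```python
from itertools import permutations

def ordered_pairs(edges):
    """Both orientations (i, j) and (j, i) of each undirected edge."""
    return {(i, j) for i, j in edges} | {(j, i) for i, j in edges}

def intersection_statistic(e1, e2, perm):
    """Number of ordered pairs (s, t) in G_1 whose image (perm[s], perm[t]) lies in G_2."""
    p2 = ordered_pairs(e2)
    return sum(1 for (s, t) in ordered_pairs(e1) if (perm[s], perm[t]) in p2)

def baseline_mean(e1, e2, n):
    """mu = 4 |E_1| |E_2| / (n (n - 1))."""
    return 4.0 * len(e1) * len(e2) / (n * (n - 1))

def exhaustive_mean(e1, e2, n):
    """Average of the statistic over all n! permutations (feasible only for tiny n)."""
    values = [intersection_statistic(e1, e2, p) for p in permutations(range(n))]
    return sum(values) / len(values)

# Hypothetical graphs on n = 4 vertices.
n = 4
e1 = [(0, 1), (1, 2)]
e2 = [(0, 1), (2, 3), (1, 3)]
t1_observed = intersection_statistic(e1, e2, list(range(n)))
mu = baseline_mean(e1, e2, n)
mu_exhaustive = exhaustive_mean(e1, e2, n)
```

With the identity permutation the function returns the observed statistic $T_1$, so the same routine serves both for the observed value and for its permutation baseline.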

5 Size biasing and multivariate concentration inequalities 

Let $W = (W_1, W_2, \ldots, W_k) \in \mathbb{R}^k$ be a random vector with nonnegative coordinate variables. In [6], concentration of measure inequalities were obtained for a positive random variable $W$ with positive mean $\mu$ and nonzero variance $\sigma^2$ under a boundedness condition on the coupling $(W, W^s)$, where $W^s$ denotes the size bias transformation of $W$, that is, it satisfies the identity

$$E(Wf(W)) = \mu E(f(W^s)) \quad \text{for all functions } f \text{ so that } E(Wf(W)) \text{ is defined.}$$

In this section, we derive a multivariate analogue of the same result. For the $W$ under consideration, assume $\mu_i = E(W_i) > 0$ for all $i = 1, 2, \ldots, k$. The $W$ size biased variate in direction $i$, denoted by $W^i$, is defined as the random variable having distribution $dF^i$ with

$$dF^i(x_1, x_2, \ldots, x_k) = \frac{x_i}{\mu_i}\,dF(x_1, x_2, \ldots, x_k),$$

where $W \sim dF$. The random variable $W^i$ thus defined satisfies

$$E(W_if(W)) = \mu_iE(f(W^i)),$$

for all functions $f$ for which the above expectations are finite. In particular,

$$E(W_ie^{\theta^tW}) = \mu_iE(e^{\theta^tW^i}). \tag{47}$$

For notational purposes, let us define for any two vectors $\theta, \phi \in \mathbb{R}^k$

$$\frac{\theta}{\phi} = \left(\frac{\theta_1}{\phi_1}, \frac{\theta_2}{\phi_2}, \ldots, \frac{\theta_k}{\phi_k}\right).$$

Theorem 5.1. Suppose $W = (W_1, W_2, \ldots, W_k)$ is a random vector with nonnegative coordinate variables, with $\mu, \sigma \succ 0$. Suppose $\|W^i - W\|_2 \le K$ for some constant $K$ for all $i = 1, 2, \ldots, k$. If $\sigma_{(1)} = \min_{i = 1, 2, \ldots, k}\sigma_i$, then for any $t \succeq 0$, we have

$$P\left(\frac{W - \mu}{\sigma} \succeq t\right) \le \exp\left(-\frac{\|t\|_2^2}{2(K_1 + K_2\|t\|_2)}\right),$$

where

$$K_1 = \frac{2K}{\sigma_{(1)}}\left\|\frac{\mu}{\sigma}\right\|_2 \quad \text{and} \quad K_2 = \frac{K}{2\sigma_{(1)}}.$$
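The directional size bias identity $E(W_if(W)) = \mu_iE(f(W^i))$ that underlies Theorem 5.1 is easy to verify for a finite distribution, where size biasing simply reweights each atom $x$ by $x_i/\mu_i$. The distribution and the test function in the sketch below are hypothetical choices for illustration.

```python
from fractions import Fraction

def size_bias(pmf, i):
    """Size bias a finite pmf on nonnegative vectors in direction i.

    pmf maps tuples x to probabilities; the biased pmf gives x mass x[i] * p / mu_i.
    """
    mu_i = sum(x[i] * p for x, p in pmf.items())
    return {x: x[i] * p / mu_i for x, p in pmf.items() if x[i] * p != 0}

def expectation(pmf, g):
    """E g(W) for a finite pmf."""
    return sum(g(x) * p for x, p in pmf.items())

# Hypothetical distribution on four points of the nonnegative quadrant.
quarter = Fraction(1, 4)
pmf = {(0, 1): quarter, (1, 1): quarter, (2, 0): quarter, (1, 2): quarter}

def f(x):
    """Hypothetical test function."""
    return x[0] + x[1] ** 2

direction = 0
mu_0 = expectation(pmf, lambda x: x[direction])
biased = size_bias(pmf, direction)
lhs = expectation(pmf, lambda x: x[direction] * f(x))
rhs = mu_0 * expectation(biased, f)
```

Atoms with $x_i = 0$ receive no mass under the biased law, which is why the point `(0, 1)` drops out of `biased` when `direction` is 0.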



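Proof. Using (7), we obtain for any $i$,

$$E(e^{\theta^tW^i}) - E(e^{\theta^tW}) \le \left|E(e^{\theta^tW^i}) - E(e^{\theta^tW})\right| \le E\left(\frac{|\theta^t(W^i - W)|}{2}\left(e^{\theta^tW^i} + e^{\theta^tW}\right)\right)$$
$$\le E\left(\frac{\|\theta\|_2\|W^i - W\|_2}{2}\left(e^{\theta^tW^i} + e^{\theta^tW}\right)\right) \le \frac{K\|\theta\|_2}{2}E\left(e^{\theta^tW^i} + e^{\theta^tW}\right).$$

Changing sides, we obtain for $\|\theta\|_2 < 2/K$,

$$E(e^{\theta^tW^i}) \le \frac{2 + K\|\theta\|_2}{2 - K\|\theta\|_2}E(e^{\theta^tW}). \tag{48}$$

Hence from (47), we obtain, for $\|\theta\|_2 < 2/K$,

$$\frac{\partial m(\theta)}{\partial\theta_i} = E(W_ie^{\theta^tW}) = \mu_iE(e^{\theta^tW^i}) \le \mu_i\frac{2 + K\|\theta\|_2}{2 - K\|\theta\|_2}m(\theta). \tag{49}$$

Denoting $M(\theta) = E(\exp(\theta^t((W - \mu)/\sigma)))$, we obtain

$$M(\theta) = m\left(\frac{\theta}{\sigma}\right)e^{-\theta^t\frac{\mu}{\sigma}}.$$

Hence, denoting

$$\partial_im(\beta) = \left.\frac{\partial m(\theta)}{\partial\theta_i}\right|_{\theta = \beta}$$

for $\beta \in \mathbb{R}^k$ and using (47) and (49), we obtain, for $\|\theta/\sigma\|_2 < 2/K$,

$$\frac{\partial M(\theta)}{\partial\theta_i} = \frac{1}{\sigma_i}\partial_im\left(\frac{\theta}{\sigma}\right)\exp\left(-\theta^t\frac{\mu}{\sigma}\right) - \frac{\mu_i}{\sigma_i}m\left(\frac{\theta}{\sigma}\right)\exp\left(-\theta^t\frac{\mu}{\sigma}\right)$$
$$\le \frac{\mu_i}{\sigma_i}M(\theta)\left(\frac{2 + K\|\theta/\sigma\|_2}{2 - K\|\theta/\sigma\|_2} - 1\right) = \frac{\mu_i}{\sigma_i}M(\theta)\frac{2K\|\theta/\sigma\|_2}{2 - K\|\theta/\sigma\|_2}. \tag{50}$$

Since (50) holds for all $i = 1, 2, \ldots, k$, we obtain, for all $\|\theta/\sigma\|_2 < 2/K$,

$$\|\nabla M(\theta)\|_2 \le \left\|\frac{\mu}{\sigma}\right\|_2M(\theta)\frac{2K\|\theta/\sigma\|_2}{2 - K\|\theta/\sigma\|_2} \le \left\|\frac{\mu}{\sigma}\right\|_2M(\theta)\frac{2K\|\theta\|_2}{\sigma_{(1)}(2 - K\|\theta/\sigma\|_2)}. \tag{51}$$

Continuing as in the proof of Theorem 2.1, (51) yields that for all $\theta \in \mathbb{R}^k$ with $\|\theta/\sigma\|_2 < 2/K$, we have

$$\|\nabla(\log(M(\theta)))\|_2 = \frac{\|\nabla M(\theta)\|_2}{M(\theta)} \le \left\|\frac{\mu}{\sigma}\right\|_2\frac{2K\|\theta\|_2}{\sigma_{(1)}(2 - K\|\theta/\sigma\|_2)}.$$

Using the mean value theorem, for all $0 \preceq \theta \in \mathbb{R}^k$ with $\|\theta/\sigma\|_2 < 2/K$,

$$\log(M(\theta)) = \nabla(\log(M(z))) \cdot \theta,$$

for some $0 \preceq z \preceq \theta$. Hence $\|z/\sigma\|_2 \le \|\theta/\sigma\|_2 < 2/K$, and

$$|\log(M(\theta))| \le \|\nabla(\log(M(z)))\|_2\|\theta\|_2 \le \left\|\frac{\mu}{\sigma}\right\|_2\frac{2K\|z\|_2}{\sigma_{(1)}(2 - K\|z/\sigma\|_2)}\|\theta\|_2. \tag{52}$$

Note that

$$\|\theta\|_2 < \frac{1}{K_2} = \frac{2\sigma_{(1)}}{K} \implies \left\|\frac{\theta}{\sigma}\right\|_2 \le \frac{\|\theta\|_2}{\sigma_{(1)}} < \frac{2}{K}.$$

Since $0 \preceq z \preceq \theta$, if $\|\theta\|_2 < 1/K_2$, (52) yields

$$\log(M(\theta)) \le \frac{K_1\|\theta\|_2^2}{2(1 - K_2\|\theta\|_2)}.$$

Hence, if $t \succeq 0$, $\theta \succeq 0$ and $\|\theta\|_2 < 1/K_2$, we obtain

$$P\left(\frac{W - \mu}{\sigma} \succeq t\right) \le P\left(\theta^t\frac{W - \mu}{\sigma} \ge \theta^tt\right) \le e^{-\theta^tt}M(\theta) \le \exp\left(-\theta^tt + \frac{K_1\|\theta\|_2^2}{2(1 - K_2\|\theta\|_2)}\right). \tag{53}$$

Using $\theta = t/(K_1 + K_2\|t\|_2)$ in (53), we obtain

$$P\left(\frac{W - \mu}{\sigma} \succeq t\right) \le \exp\left(-\frac{\|t\|_2^2}{2(K_1 + K_2\|t\|_2)}\right).$$

□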
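The last substitution in the proof can be checked numerically: plugging $\theta = t/(K_1 + K_2\|t\|_2)$ into the exponent $-\theta^tt + K_1\|\theta\|_2^2/(2(1 - K_2\|\theta\|_2))$ gives exactly $-\|t\|_2^2/(2(K_1 + K_2\|t\|_2))$. The constants and the vector `t` in this sketch are hypothetical.

```python
import math

def l2_norm(v):
    """Euclidean norm of a vector given as a list."""
    return math.sqrt(sum(x * x for x in v))

def exponent(theta, t, k1, k2):
    """Exponent -theta.t + K1 ||theta||^2 / (2 (1 - K2 ||theta||)), valid for ||theta|| < 1/K2."""
    norm = l2_norm(theta)
    if norm >= 1.0 / k2:
        raise ValueError("theta must satisfy ||theta||_2 < 1 / K2")
    dot = sum(a * b for a, b in zip(theta, t))
    return -dot + k1 * norm * norm / (2.0 * (1.0 - k2 * norm))

def theorem51_bound(t, k1, k2):
    """Tail bound exp(-||t||^2 / (2 (K1 + K2 ||t||)))."""
    norm_t = l2_norm(t)
    return math.exp(-norm_t ** 2 / (2.0 * (k1 + k2 * norm_t)))

# Hypothetical constants and threshold vector.
k1, k2 = 2.0, 0.5
t = [3.0, 4.0]
scale = k1 + k2 * l2_norm(t)
theta = [x / scale for x in t]
value_at_theta = exponent(theta, t, k1, k2)
closed_form = -l2_norm(t) ** 2 / (2.0 * scale)
```

Note that the chosen $\theta$ always satisfies $\|\theta\|_2 < 1/K_2$, since $\|t\|_2/(K_1 + K_2\|t\|_2) < 1/K_2$ whenever $K_1 > 0$, so the substitution is admissible for every $t \succeq 0$.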



6 An application 

Let $\tau_1, \tau_2 \in S_m$ be two fixed permutations from $S_m$, the permutation group on $m$ elements. Let $\pi$ be a permutation selected uniformly at random from $S_n$, where $n \ge m$. We consider the bivariate random variable $W = (W_1, W_2)$, where $W_1$ counts the number of times the pattern $\tau_1$ appears in $\pi$ and $W_2$ counts the number of times $\tau_2$ appears in $\pi$. Concentration of measure inequalities for $W_1$ have been obtained in [7]. Using Theorem 5.1 we can in fact obtain concentration bounds for $(W_1, W_2)$.

To fix notation, for $n \ge m \ge 3$, let $\pi$ and $\tau$ be permutations of $V = \{1, \ldots, n\}$ and $\{1, \ldots, m\}$, respectively, and let

$$V_\alpha = \{\alpha, \alpha + 1, \ldots, \alpha + m - 1\} \quad \text{for } \alpha \in V,$$

where addition of elements of $V$ is modulo $n$. We say the pattern $\tau$ appears at location $\alpha \in V$ if the values $\{\pi(v)\}_{v \in V_\alpha}$ and $\{\tau(v)\}_{v \in V_1}$ are in the same relative order. Equivalently, the pattern $\tau$ appears at $\alpha$ if and only if $\pi(\tau^{-1}(v) + \alpha - 1)$, $v \in V_1$, is an increasing sequence. When $\tau = \iota_m$, the identity permutation of length $m$, we say that $\pi$ has a rising sequence of length $m$ at position $\alpha$. Rising sequences are studied in [5] in connection with card tricks and card shuffling.

Letting $\pi$ be chosen uniformly from all permutations of $\{1, \ldots, n\}$, and $X_{\alpha,\tau}$ be the indicator that $\tau$ appears at $\alpha$,

$$X_{\alpha,\tau}(\pi(v), v \in V_\alpha) = 1\left(\pi(\tau^{-1}(1) + \alpha - 1) < \cdots < \pi(\tau^{-1}(m) + \alpha - 1)\right),$$

the sum $W = \sum_{\alpha \in V} X_{\alpha,\tau}$ counts the number of $m$-element-long segments of $\pi$ that have the same relative order as $\tau$.
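The pattern count $W$ can be computed directly from the definition by sliding a cyclic window of length $m$ over $\pi$ and comparing relative orders. The sketch below is illustrative: permutations are stored as lists of values with `pi[0]` standing for $\pi(1)$, and the example permutation is hypothetical. The helper that sums over all permutations is only a brute-force check of the mean $n/m!$ for tiny $n$.

```python
from itertools import permutations

def pattern_appears(pi, tau, alpha):
    """True if tau appears at location alpha (1-based), with windows wrapping modulo n."""
    n, m = len(pi), len(tau)
    window = [pi[(alpha - 1 + offset) % n] for offset in range(m)]
    # Same relative order: every pair compares the same way in the window and in tau.
    return all((window[a] < window[b]) == (tau[a] < tau[b])
               for a in range(m) for b in range(a + 1, m))

def count_pattern(pi, tau):
    """W = number of locations alpha at which tau appears in pi."""
    return sum(1 for alpha in range(1, len(pi) + 1) if pattern_appears(pi, tau, alpha))

def total_count_over_all_permutations(n, tau):
    """Sum of W over all n! permutations (brute force, tiny n only)."""
    return sum(count_pattern(list(p), tau) for p in permutations(range(1, n + 1)))

# Hypothetical example permutation of {1, ..., 5}.
pi = [2, 4, 1, 5, 3]
w_132 = count_pattern(pi, [1, 3, 2])
w_rising = count_pattern(pi, [1, 2, 3])
total = total_count_over_all_permutations(4, [1, 3, 2])
```

Each of the $n$ windows shows a given pattern with probability $1/m!$ under a uniform permutation, so the brute-force total over all $n!$ permutations equals $n! \cdot n/m!$.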

Let $\sigma_\alpha$ be the permutation of $\{1, \ldots, m\}$ for which

$$\pi(\sigma_\alpha(1) + \alpha - 1) < \cdots < \pi(\sigma_\alpha(m) + \alpha - 1),$$

and let

$$\pi_1^\alpha(v) = \begin{cases} \pi(\sigma_\alpha(\tau_1(v - \alpha + 1)) + \alpha - 1), & v \in V_\alpha, \\ \pi(v), & v \notin V_\alpha. \end{cases}$$

In other words, $\pi_1^\alpha$ is the permutation $\pi$ with the values $\pi(v)$, $v \in V_\alpha$, reordered so that $\pi_1^\alpha(\gamma)$ for $\gamma \in V_\alpha$ are in the same relative order as $\tau_1$. Similarly we can define $\pi_2^\alpha$ corresponding to $\tau_2$.

To obtain $W^i$, the $W$ size biased variate in direction $i$ for $i = 1, 2$, pick an index $\beta$ uniformly from $\{1, 2, \ldots, n\}$ and set $W_j^i = \sum_{\alpha \in V} X_{\alpha,\tau_j}(\pi_i^\beta)$. Then $W^i = (W_1^i, W_2^i)$ for $i = 1, 2$.

The fact that we indeed obtain the desired size bias variates follows from results in [5]. Since both $\pi_1^\beta$ and $\pi_2^\beta$ agree with $\pi$ on all the indices outside $V_\beta$, and $|V_\beta| = m$, we obtain $|W_j^i - W_j| \le 2m - 1$ for $i, j = 1, 2$. Hence, $\|W - W^i\|_2 \le (2m - 1)\sqrt{2}$ for $i = 1, 2$.

For $\tau \in S_m$, let $I_k(\tau)$ be the indicator that $\tau(1), \ldots, \tau(m - k)$ and $\tau(k + 1), \ldots, \tau(m)$ are in the same relative order. Following the calculations in [7], we obtain

$$\mu_i = E(W_i) = \frac{n}{m!} \quad \text{and} \quad \sigma_i^2 = \operatorname{var}(W_i) = n\left(\frac{1}{m!}\left(1 - \frac{2m - 1}{m!}\right) + 2\sum_{k=1}^{m-1}\frac{I_k(\tau_i)}{(m + k)!}\right).$$

Since $0 \le I_k \le 1$, the variance lower bound is obtained when $I_k = 0$, yielding

$$\sigma_i^2 \ge \frac{n}{m!}\left(1 - \frac{2m - 1}{m!}\right).$$

Since the constants $K_1$ and $K_2$ in Theorem 5.1 can be replaced by larger constants, we can apply it with

$$K_1 = \frac{(8m - 4)\,m!}{m! - 2m + 1} \quad \text{and} \quad K_2 = \frac{(2m - 1)\,m!}{\sqrt{2n(m! - 2m + 1)}},$$

to obtain a concentration inequality for $W = (W_1, W_2)$.
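The closed forms for $K_1$ and $K_2$ follow from substituting $K = (2m-1)\sqrt{2}$, $\mu_i = n/m!$ and the variance lower bound into the definitions $K_1 = (2K/\sigma_{(1)})\|\mu/\sigma\|_2$ and $K_2 = K/(2\sigma_{(1)})$ of Theorem 5.1. The sketch below recomputes both routes and compares them; the values of `m` and `n` are arbitrary, and the function names are assumptions.

```python
import math

def constants_from_definitions(m, n):
    """K1, K2 from Theorem 5.1 with sigma replaced by its lower bound (k = 2 coordinates)."""
    m_fact = math.factorial(m)
    coupling_bound = (2 * m - 1) * math.sqrt(2.0)          # K
    mu = n / m_fact                                        # mu_1 = mu_2
    sigma_lower = math.sqrt(n / m_fact * (1.0 - (2 * m - 1) / m_fact))
    mu_over_sigma_norm = math.sqrt(2.0) * mu / sigma_lower  # ||mu / sigma||_2 upper bound
    k1 = 2.0 * coupling_bound / sigma_lower * mu_over_sigma_norm
    k2 = coupling_bound / (2.0 * sigma_lower)
    return k1, k2

def constants_closed_form(m, n):
    """K1 = (8m - 4) m! / (m! - 2m + 1), K2 = (2m - 1) m! / sqrt(2n (m! - 2m + 1))."""
    m_fact = math.factorial(m)
    denom = m_fact - 2 * m + 1
    k1 = (8 * m - 4) * m_fact / denom
    k2 = (2 * m - 1) * m_fact / math.sqrt(2.0 * n * denom)
    return k1, k2

m, n = 3, 100  # arbitrary illustrative values with n >= m >= 3
k1_def, k2_def = constants_from_definitions(m, n)
k1_cf, k2_cf = constants_closed_form(m, n)
```

Observe that $K_1$ does not depend on $n$ while $K_2$ decays like $n^{-1/2}$, so for large $n$ the bound of Theorem 5.1 is essentially sub-Gaussian with variance proxy $K_1$.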